PSLC DataShop: A Data Analysis Service for the Learning Science Community
Abstract
The Pittsburgh Science of Learning Center’s DataShop is an open data repository and a set of associated visualization and analysis tools. DataShop has data from thousands of students deriving from interactions with on-line course materials and intelligent tutoring systems. The data is fine-grained, with student actions recorded roughly every 20 seconds, and it is longitudinal, spanning semester-long or year-long courses. Currently, over 188 datasets are stored, comprising over 42 million student actions and over 150,000 student hours of data. Most student actions are “coded,” meaning they are not only graded as correct or incorrect, but are also categorized in terms of the hypothesized competencies or knowledge components needed to perform that action. DataShop provides a number of features to facilitate data analysis, including a data schema that allows researchers to import data into DataShop or export data from the repository in order to perform additional analysis. DataShop also offers online analysis tools for tasks such as visualizing student performance and analyzing learning curves. Researchers can export cognitive models, make changes, and upload the changed model for further analysis. One new feature that has been added to DataShop is an easy-to-use API for using web services to access the repository. These web services allow developers to identify datasets in the repository and directly export data from them at the transaction or student-step level. In the near future, developers will be able to add new fields back into the repository with the use of our web services for custom fields. Researchers have analyzed these data to better understand student cognitive and affective states, and the results have been used to redesign instruction and demonstrably improve student learning [1]. Researchers can find out more and sign up for access to DataShop from our website: http://pslcdatashop.org
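As a rough illustration of what a learning-curve analysis over knowledge-component-coded steps might look like, the sketch below computes the error rate at each practice opportunity per knowledge component. It is not DataShop's implementation or export format: the record layout, the `learning_curve` function, and the sample rows are all hypothetical and invented for this example.

```python
from collections import defaultdict

# Hypothetical step-level records in chronological order:
# (student, knowledge_component, correct_on_first_attempt).
# These rows are made up; they are not DataShop data.
steps = [
    ("s1", "add-fractions", False),
    ("s1", "add-fractions", True),
    ("s2", "add-fractions", False),
    ("s2", "add-fractions", False),
    ("s1", "add-fractions", True),
    ("s2", "add-fractions", True),
]


def learning_curve(records):
    """Return {kc: [error rate at opportunity 1, 2, ...]}."""
    # (student, kc) -> number of opportunities seen so far
    seen = defaultdict(int)
    # kc -> opportunity number -> [error count, attempt count]
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for student, kc, correct in records:
        seen[(student, kc)] += 1
        opportunity = seen[(student, kc)]
        cell = totals[kc][opportunity]
        cell[0] += 0 if correct else 1
        cell[1] += 1
    return {
        kc: [by_opp[o][0] / by_opp[o][1] for o in sorted(by_opp)]
        for kc, by_opp in totals.items()
    }


curve = learning_curve(steps)
print(curve)
```

An error rate that falls as the opportunity count rises is the pattern such a curve is meant to reveal; a flat or noisy curve may suggest that the hypothesized knowledge component is mis-specified.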
Similar resources
What's in a Step? Toward General, Abstract Representations of Tutoring System Log Data
The Pittsburgh Science of Learning Center (PSLC) is developing a data storage and analysis facility, called DataShop. It currently handles log data from 6 full-year tutoring systems and dozens of smaller, experimental tutoring systems. DataShop requires a representation of log data that supports a variety of tutoring systems, atheoretical analyses and theoretical analyses. The theory-based analy...
LearnLab's DataShop: A Data Repository and Analytics Tool Set for Cognitive Science
In "What Should Be the Data Sharing Policy of Cognitive Science?", Pitt and Tang (2013) make the case for an open data-sharing policy in Cognitive Science and highlight the use of online data repositories to store and share raw research data. One such data repository is the LearnLab DataShop (http://pslcdatashop.org) hosted at Carnegie Mellon University. DataShop is part of LearnLab, an NSF-funded ...
Analyzing Student Inquiry Data Using Process Discovery and Sequence Classification
This paper reports on results of applying process discovery mining and sequence classification mining techniques to a data set of semi-structured learning activities. The main research objective is to advance educational data mining to model and support self-regulated learning in heterogeneous environments of learning content, activities, and social networks. As an example of our current resear...
Data Sharing: Low-Cost Sensors for Affect and Cognition
The Educational Data Mining (EDM) community has experienced many benefits from the open sharing of data. Efforts such as the Pittsburgh Science of Learning Center DataShop have helped in the development of learning data storage and standards in the educational community. In other fields, standards of comparison have been created through publication, sharing, and competition on identical dataset...
A Probabilistic Model for Knowledge Component Naming
Recent years have seen significant advances in automatic identification of the Q-matrix necessary for cognitive diagnostic assessment. As data-driven approaches are introduced to identify latent knowledge components (KC) based on observed student performance, it becomes crucial to describe and interpret these latent KCs. We address the problem of naming knowledge components using keyword automa...